Abstract: An operations research problem concerning the optimal SAM firing pattern
to defend an aircraft carrier is solved via applications of the concepts of closed-loop
(feedback) and open-loop optimal control. The SAM defense problem is formulated as a Markov
decision process with the number of SAMs in each salvo as the decision variable.
Interesting cases, including the presence of imperfect sensor observation and a bound on
the number of SAMs available, are considered. The principle of dynamic programming
and the technique of nonlinear integer programming are applied to reach closed-loop and
open-loop solutions. Numerical examples are given for illustration.


INTRODUCTION 

There has been much recent successful cross-fertilization between the fields of optimal
control and operations research (1). Modern control theory has found applications in solving
economic (2), management science, and resource allocation problems (3). Pontryagin's Maximum
Principle of control theory is generally the main technique used in these applications. The present
report, however, emphasizes the concept of closed-loop (feedback) and open-loop optimal
control in solving the surface-to-air-missile (SAM) defense problem for an aircraft carrier under
various sensor conditions.

An air defense and offense game model was formulated by Brodheim and others (4). They
considered the problem as a two-person zero-sum game. The problem treated in this report
is different in many aspects. In particular: (a) Only defensive systems are of interest; the
strategies of the offense are not considered. (b) The objective of the defense is to protect a ship
from enemy missiles with minimum expected cost of SAMs and of damage to the ship by the enemy
missiles which survive interception by SAMs. (c) The sensor conditions are more involved; the
observations concerning the number of enemy missiles in the attack are considered for the following
cases: perfect observation, imperfect observation, and no observation. (d) The problem under
consideration is simpler than that considered in Ref. 4, but this simple defense model permits
much more extensive study and analysis.

The  problem  is  formulated  as  a  Markov  decision  process  with  the  size  of  each  salvo,  the 
number  of  SAMs,  as  the  decision  variable  (or  "control  variable").  Corresponding  to  different 
sensor  conditions,  the  optimal  decisions  are  found  by  applying  the  concepts  of  closed-  and/or 
open-loop  optimal  control.  We  also  consider  the  case  where  the  number  of  SAMs  onboard  is 
limited. A Markov decision process with two state variables is formed for this case, where the
states  are  arranged  in  matrix  form.  The  principle  of  dynamic  programming  and  the  technique 
of  nonlinear  integer  programming  are  used  to  solve  problems  of  this  type. 


NRL Problem B0I-10: Project RR 003-02-41-6152. This is a final report on one phase of the problem; work is continuing on other
phases. Manuscript submitted October 2, 1970.
NOTATION

Small letters c, g, etc., represent vectors with elements denoted by c_j, g_j, etc. Capital
letters F, etc., represent matrices with elements f_jk. Scalars are explicitly mentioned. The vector
c and the matrix F at stage i are denoted by c(i) and F(i), respectively. The transpose of
F is denoted by F^T.

MODEL  OF  SAM  DEFENSE  SYSTEM 

It is assumed that a group of enemy missiles (EMs) is on its way to attack a ship which is
defended by SAMs. From the observations and information concerning the speed and position
of the EMs, the number of SAM salvos that can be launched in time to intercept the EMs before the
time of final impact on the ship is determined at the initial time. We denote this number by I. The
problem is to choose the size of each SAM salvo such that an objective function is minimized.
Further assumptions concerning this model are listed as follows:

1. From radar output and other sources of information, it is assumed that the defense has
an initially perfect knowledge of the number of EMs, which is denoted by n.

2. The EMs are assumed to arrive in a group and the SAM salvo is aimed at this group. It
is further assumed that one SAM can destroy at most one EM. Therefore, for instance, if the
probability of killing an EM is q, then the probability of killing two EMs from a group of three,
when five SAMs are launched, is

$$ \binom{5}{2}\, q^{2} (1-q)^{3}. $$

3. The objective function of n EMs with I SAM salvos available for launching can be
expressed by

(the objective function) = (the expected cost of total SAMs to be launched)
                           + (the expected cost of damage caused by the final impact with EMs).

A  Markov  Decision  Process 

The problem described is formulated as a Markov decision process. Let the state variable
of the process be the number of EMs which survive the SAMs' attack. The state number
corresponds to the number of EMs surviving. For example, if 3 EMs remain at a certain time, the
state of the system is state 3. After the next salvo of SAMs, the number of EMs surviving could
be 0, 1, 2, or 3, which corresponds to states 0, 1, 2, or 3. In other words, the state of the process
could be transferred to state 0, 1, 2, or 3. The probabilities associated with these transitions
depend on the number of SAMs in the salvo, which is called the decision variable. The state
transition diagram is shown in Fig. 1. As shown in Fig. 1, this decision process has the special
property that there are no transitions from a given state to one of higher index. State 0 is the
"terminal state," which means that a transition to state 0 implies that the process will remain
in state 0. States 1, 2, ..., and n are "transient" states. The quantity l_j(i) shown in Fig. 1 is the
number of SAMs to be launched at the ith stage in state j (i.e., with j EMs remaining).

For convenience the index i, which denotes the ith transition stage, runs from -I to 0, where
I is defined previously as the number of SAM salvos allowed. Therefore, the (-I)th stage is
the beginning stage and the zeroth stage is the final stage. At the zeroth stage, all enemy missiles
have reached the ship; no further defensive action can be taken. The equations which govern
the probabilities of state transitions are


NRL  REPORT  7210 




[Fig. 1. State transition diagram: decision variables l_0(i), l_1(i), l_2(i), l_3(i), ..., l_n(i)
attached to the state variables 0, 1, 2, 3, ..., n]

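$$
\begin{bmatrix} p_0(i+1) \\ p_1(i+1) \\ p_2(i+1) \\ \vdots \\ p_n(i+1) \end{bmatrix}
=
\begin{bmatrix}
1 & f_{01}(i) & f_{02}(i) & \cdots & f_{0n}(i) \\
0 & f_{11}(i) & f_{12}(i) & \cdots & f_{1n}(i) \\
0 & 0 & f_{22}(i) & \cdots & f_{2n}(i) \\
\vdots & & & \ddots & \vdots \\
0 & 0 & \cdots & 0 & f_{nn}(i)
\end{bmatrix}
\begin{bmatrix} p_0(i) \\ p_1(i) \\ p_2(i) \\ \vdots \\ p_n(i) \end{bmatrix}
\tag{1}
$$

with initial conditions

$$
\begin{bmatrix} p_0(-I) \\ p_1(-I) \\ p_2(-I) \\ \vdots \\ p_{n-1}(-I) \\ p_n(-I) \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix},
$$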

where p_j(i) is the probability of being in state j at the ith stage, and f_jk(i) is the conditional
probability of transition from state k to state j at the ith stage. The values of f_jk(i) are calculated
based on the special characteristics of the process and the preceding basic assumptions about
the probability of interception:
$$
f_{jk}(i) =
\begin{cases}
0, & \text{if } j > k, \\[4pt]
\dbinom{l}{k-j}\,\bigl(1-q(i)\bigr)^{\,l-k+j}\, q(i)^{\,k-j}, & \text{if } j \le k \text{ and } j \neq 0, \\[4pt]
1 - \sum_{r=1}^{k} f_{rk}(i), & \text{if } j = 0,
\end{cases}
\tag{2}
$$

with

$$ l = l_k(i), $$

where q(i) is the given probability that a SAM hits an EM at the ith stage, and l_k(i) represents
the number of SAMs to be launched at the ith stage when k EMs survive.
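The transition probabilities of Eq. (2) can be sketched in a few lines of Python. This is a minimal illustration, not the report's program: it assumes the reading of Eq. (2) in which the number of hits in a salvo is binomial and each hit removes at most one EM, and the function name `transition_column` is hypothetical.

```python
from math import comb

def transition_column(k, sams, q):
    """Return [f_0k, f_1k, ..., f_kk] for one salvo of `sams` SAMs fired at k EMs.

    Assumption: exactly (k - j) hits leave j >= 1 survivors, and every outcome
    with k or more hits ends in state 0.
    """
    col = [0.0] * (k + 1)
    for j in range(1, k + 1):
        kills = k - j  # exactly this many SAMs must hit
        if kills <= sams:
            col[j] = comb(sams, kills) * q**kills * (1.0 - q) ** (sams - kills)
    col[0] = 1.0 - sum(col[1:])  # remaining probability: no EM survives
    return col

# Two of three EMs destroyed by five SAMs, i.e. one survivor (j = 1).
example = transition_column(3, 5, 0.632)[1]
```

With q = 0.632 the value of `example` is C(5,2) q^2 (1-q)^3, about 0.199.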

Equation (1) can be written in vector-matrix form:

$$ p(i+1) = F(l(i))\, p(i); \qquad p(-I) = \text{given}, \tag{3} $$

where l(i) is the (n + 1)-dimensional decision vector whose elements are l_j(i), for j = 0, ..., n.
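A short sketch of Eq. (3) follows: it builds the matrix F for a given decision vector and advances the state probabilities by one stage. The helper names `transition_matrix` and `propagate` and the sample policy are illustrative assumptions, and the entries follow the binomial reading of Eq. (2).

```python
from math import comb

def transition_matrix(policy, q):
    """F[j][k]: probability of going from k EMs to j EMs when policy[k] SAMs are fired."""
    n = len(policy) - 1
    F = [[0.0] * (n + 1) for _ in range(n + 1)]
    for k in range(n + 1):
        sams = policy[k]
        for j in range(1, k + 1):
            kills = k - j
            if kills <= sams:
                F[j][k] = comb(sams, kills) * q**kills * (1.0 - q) ** (sams - kills)
        # Column k must sum to one; the remainder is the probability of state 0.
        F[0][k] = 1.0 - sum(F[j][k] for j in range(1, k + 1))
    return F

def propagate(F, p):
    """One step of p(i+1) = F p(i)."""
    size = len(p)
    return [sum(F[j][k] * p[k] for k in range(size)) for j in range(size)]

# Hypothetical policy for n = 3: fire as many SAMs as there are EMs.
F = transition_matrix([0, 1, 2, 3], 0.632)
p_next = propagate(F, [0.0, 0.0, 0.0, 1.0])  # start in state n = 3 with certainty
```

Because F is upper triangular with unit column sums, `p_next` remains a probability vector and no probability flows to a state of higher index.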


Markov  Process  with  Cost 

The cost function described previously can be associated with this Markov process. Define
c_j(i) = the expected total cost from the ith stage to the end of the process, if the system is now
in state j, given l_j(m) for m = i, ..., -1. The expected cost includes the cost of SAMs launched
and the cost of damage to the ship by surviving EMs.

Based on this definition,

$$ c_j(i) = \Bigl[\, l_j(i) + \sum_{m=i+1}^{-1} l(m)^{T} p(m) \Bigr]\, b + p(0)^{T} g, \qquad \text{for } i = -I, \dots, -1, \tag{4} $$

where the first term represents the expected cost of the total number of SAMs launched, and b,
a scalar, is the cost of each individual SAM. The second term represents the expected cost of
the terminal damage on the ship, and the vector g is an (n + 1) vector whose element g_k gives the
expected cost of damage to the ship, should k enemy missiles survive all SAMs' attack.

Using the state transition equation (Eq. (3)), a recurrence relation for the vector c(i) is derived.
After manipulations, we have

$$ c(i) = b\, l(i) + F(l(i))^{T} c(i+1), \qquad \text{for } i = -1, -2, \dots, -I, \tag{5} $$

with

$$ c(0) = g. \tag{6} $$

F(l(i)) is the transition matrix given in Eq. (1). Equation (5) is the key equation in this report.
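The step from the definition of c_j(i) to the recurrence can be sketched componentwise. This is one way to see it, stated under the assumption that the cost to go from state j splits into the cost of the current salvo plus the expected cost to go from whichever state k is reached next:

```latex
\begin{aligned}
c_j(i) &= b\, l_j(i) + \sum_{k=0}^{n} f_{kj}(i)\, c_k(i+1), \qquad j = 0, \dots, n, \\
\bigl(F^{T} c(i+1)\bigr)_j &= \sum_{k=0}^{n} f_{kj}(i)\, c_k(i+1)
\quad\Longrightarrow\quad
c(i) = b\, l(i) + F(l(i))^{T} c(i+1).
\end{aligned}
```

The transpose appears because f_kj(i) is the probability of moving from state j to state k, so the sum for c_j(i) runs down column j of F.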

Howard (5) has formulated an economic decision process and has made a significant
contribution in finding the optimal decisions in the steady state. The process treated in this report
is different from Howard's in that the terminal cost contributes a great part of the total cost,
and the special properties indicated by Fig. 1 make the problem of finding the optimal l_j(i)
computationally feasible even if the number of states n is large (say 50).
Closed- and Open-Loop Controls

In the following sections, the problems of finding the optimal vector l(i) for i = -I, ..., -1
are considered under different sensor observations. The concept of open- and closed-loop
(feedback) control is extensively applied to this problem. For clarity we briefly discuss the concept
here. In the closed-loop control of Fig. 2, the output of the Markov process, j, is observed at the ith
stage. The observed results are used to make a decision on l_j(i), which in turn drives the Markov
process. The whole arrangement forms a closed loop. In the open-loop control, the size of the salvos,
the scalar a(i) for i = -I, ..., -1, is predetermined. No decision-making is involved. The open-loop
control is applied when the observations are not available. The closed-loop control is a more
sophisticated algorithm than the open-loop control, and it yields the smallest possible
expected cost for the decision process. In this report, an additional case which is called semiclosed
loop is introduced. It is interesting to compare the numerical results of these three cases.


[Fig. 2. Closed- and open-loop control: (a) closed-loop control; (b) open-loop control]




OPTIMAL  SAM  DEFENSE  SYSTEM 


Case I. Perfect Observation (Closed-Loop Control)*

In this case the defense has perfect knowledge of the number of EMs remaining at every
moment (i.e., at all i). In other words, the decision maker has perfect knowledge of what the
state of the system is. From the principle of dynamic programming, the optimal l(i), denoted by
l*(i), can be found iteratively from

$$ c^{*}(i) = \min_{l(i)} \bigl\{ b\, l(i) + F(l(i))^{T} c^{*}(i+1) \bigr\}; \qquad c^{*}(0) = g, \qquad \text{for } i = -1, -2, \dots, -I, \tag{7} $$

where Eq. (7) is obtained directly from Eq. (5) and c*(i) is the optimal cost function (or the
optimal return function in control theory). The optimization of Eq. (7) is carried out backwards,
stage by stage and state by state. In more detail, at each stage, there are n optimal l_j(i) to be
chosen to minimize the corresponding n cost functions c_j(i). Since l_j(i) has to be a positive
integer and F(l(i)) is a nonlinear function of l_j(i), this constitutes a nonlinear integer
programming problem. Numerical results for l*(i) are found by assuming that c_j(i) has a single
relative minimum. This assumption is intuitively reasonable since the cost function shows a
tradeoff between the cost of SAMs and the cost of damage suffered by the ship. If this assumption
is not true, the absolute minimum has to be chosen out of a finite set of relative minima.
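The backward recursion of Eq. (7) can be sketched as follows. This is an illustrative reimplementation, not the report's program: the function names are hypothetical, the binomial reading of Eq. (2) is assumed, and the minimization is a brute-force scan over salvo sizes up to an assumed cap `max_sams` rather than a search that relies on a single relative minimum.

```python
from math import comb

def expected_cost_to_go(k, sams, q, b, next_cost):
    """Cost of firing `sams` SAMs at k EMs plus the expected cost from the next stage on."""
    total = b * sams
    prob_none_left = 1.0
    for j in range(1, k + 1):
        kills = k - j
        if kills <= sams:
            f_jk = comb(sams, kills) * q**kills * (1.0 - q) ** (sams - kills)
            total += f_jk * next_cost[j]
            prob_none_left -= f_jk
    return total + prob_none_left * next_cost[0]

def closed_loop_dp(n, num_salvos, q, b, g, max_sams=60):
    """Return (policy, cost) dictionaries keyed by stage i = -1, ..., -num_salvos."""
    cost = {0: list(g)}          # c*(0) = g
    policy = {}
    for i in range(-1, -num_salvos - 1, -1):
        cost[i] = [cost[i + 1][0]] + [0.0] * n   # state 0 is terminal: fire nothing
        policy[i] = [0] * (n + 1)
        for k in range(1, n + 1):
            best = min(
                range(max_sams + 1),
                key=lambda l: expected_cost_to_go(k, l, q, b, cost[i + 1]),
            )
            policy[i][k] = best
            cost[i][k] = expected_cost_to_go(k, best, q, b, cost[i + 1])
    return policy, cost

# Data of the numerical example: q = 0.632, b = 1, g_j = 100 j, n = 10, I = 8.
policy, cost = closed_loop_dp(10, 8, 0.632, 1.0, [100.0 * j for j in range(11)])
```

For one EM at the last stage the sketch gives policy[-1][1] = 5 and cost[-1][1] close to 5.6749, the same values the report lists for that case.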

It should be noted that the concept of closed-loop control (feedback control) has been applied
to the problem. The optimal closed-loop decisions, where the state is assumed known when
the decision is made, yield the smallest possible expected cost for this Markov decision process.
This smallest expected cost serves as a lower bound for all the cases to be discussed in the
subsequent sections.

A  numerical  example  is  given  below  to  illustrate  how  the  optimal  decisions  are  chosen  at 
each  stage.  The  ship  has  perfect  observations  at  all  times.  The  important  data  concerning  this 
example  are  given  as  follows: 

The probability of destroying a missile, q(i) = 0.632, for all i.

The cost of a single SAM, b = 1 unit.

The cost of ship damage by j EMs, g_j = j × 100 units for j = 0, ..., n.

The number of enemy missiles in the raid, n = 10.

The maximum number of SAM salvos that can be launched, I = 8.

The optimal l_j*(i) and c_j*(i) for j = 1, ..., 10 and i = -8, -7, ..., 0 are listed in Table 1.

In real-time applications, l_j*(i) is chosen at every stage according to the state of the system.
For example, if i = -3 and j = 9, from the table,

l_9*(-3) = 12,
c_9*(-3) = 15.662.

This means that 12 SAMs is the optimal decision at i = -3 when 9 EMs remain, and the
associated optimal cost from i = -3 to the end is 15.662. At i = 0, since there is no time for further
defense, the cost is g_j if j enemy missiles are left. Once the process reaches state 0 (no EM left),
the process is terminated; no SAM is to be fired and the expected cost is 0.
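As a check on how a table entry arises, the single-missile case at the last stage can be worked by hand. The sketch assumes that, with one EM and l SAMs, the EM survives only if every SAM misses, so that the minimization reduces to one line with b = 1, g_1 = 100, and q = 0.632:

```latex
\begin{aligned}
c_1(-1) &= b\, l + g_1 (1-q)^{l} = l + 100\,(0.368)^{l}, \\
l = 4:&\quad 4 + 100\,(0.368)^{4} \approx 5.8340, \\
l = 5:&\quad 5 + 100\,(0.368)^{5} \approx 5.6749, \\
l = 6:&\quad 6 + 100\,(0.368)^{6} \approx 6.2484.
\end{aligned}
```

The minimum is at l = 5, which agrees with the entry for one EM at stage -1 in Table 1 (5 SAMs, cost 5.6749).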

Remarks. 1. The optimal closed-loop decision l_j*(i) does not determine the state of the
system at the (i + 1)th stage, but it does determine the probability of each state occurring at the
(i + 1)th stage.

2. Table 1 can be applied to any situation where n ≤ 10 and I ≤ 8 because of the special
characteristics of this decision process.


*The solution to this problem for the special case of one EM has been obtained independently by D. Kaplan (6) using a different method.


Table 1
Optimal Closed-Loop Solution
(each entry gives l_j*(i) followed by c_j*(i); the stage 0 column gives the terminal cost g_j)

State                                                Stage (i)
 (j)      -8            -7            -6            -5            -4            -3            -2            -1           0

  1     1  1.5852     1  1.5903     1  1.6040     1  1.6410     1  1.7429     1  2.0188     2  2.7685     5  5.6749     100
  2     2  3.1704     2  3.1805     2  3.2078     2  3.2811     2  3.4736     2  3.9469     4  4.8668     7  8.2618     200
  3    [illegible]    3  4.7708     3  4.8114     3  4.9188     3  5.1844     4  5.6364     5  6.6820     9  10.734     300
  4     4  6.3409     4  6.3610     4  6.4117     4  6.5583     5  6.8520     5  7.3602     7  8.5181    12  12.968     400
  5     5  7.9261     5  7.9511     5  8.0177     5  8.1854     6  8.5772     7  9.0403    [illegible]   14  15.111     500
  6     6  9.5112     6  9.5412     6  9.6203     7  9.7967     7  10.116     8  10.702    10  12.050    16  17.210     600
  7     7  11.096     7  11.131    [illegible]    8  11.400     9  11.751    10  12.378    12  13.829    18  19.273     700
  8     8  12.682     8  12.721     9  12.816     9  13.010    10  13.359    11  14.007    [illegible]   20  21.307     800
  9     9  14.267     9  14.311    10  14.408    11  14.620    11  14.979    12  15.662    15  17.243    22  23.318     900
 10    10  15.852    11  15.898    11  16.003    12  16.214    13  16.596    14  17.289    16  18.930    24  25.31[?]  1000

3. It is interesting to note in Table 1 that l_j*(i) → j as i becomes a large negative number.
This is essentially a steady-state optimal decision when the process is at a stage which is very
far away from the end. The proof is straightforward (5) and is not given here.

4. The computer time consumed in computing Table 1 is 30 sec on the CDC 3800. It takes
about 5 min to compute a similar table for n = 50 and I = 20.

Case  II.  Imperfect  Observation  (Semiclosed-Loop  Control) 

The closed-loop solution as discussed in Case I assumes perfect observations. In the case of
imperfect observations on the state, the closed-loop solution cannot be applied since the state
of the system is not completely known. In this section, it is assumed that the sensor can only
determine whether there are some EMs remaining or there is no EM at all (i.e., all EMs have
been intercepted). In other words, the outputs from the sensor are either zero or not zero.
However, the assumption of perfect knowledge of the number of EMs in the raid at the initial time,
i = -I, is still sustained. The interpretation of this situation is that at the initial time, the defense
has observations from all information sources, which would provide a sufficient amount of data
for a best estimate of the number of EMs, but during the combat time, the only observations
available are given by the radar, which can only detect whether there are EMs or not. This is a
practical assumption since if the enemy missiles are coming in a group, they cluster so that the
radar cannot determine how many missiles are in the raid.

The approach used in solving this optimization problem resembles that of the calculus of
variations. The optimization cannot be carried out stage by stage, state by state. Instead the
decisions for all stages have to be chosen simultaneously to minimize the cost function.
The cost function as given by Eqs. (5) and (6) still holds except that

$$ l_j(i) = a(i) \quad \text{for all } j \neq 0, \tag{8} $$

and

$$ l_0(i) = 0, $$

where a(i) is a scalar. The interpretation of Eq. (8) is that at the ith stage the number of SAMs
to be launched is independent of the state j for j ≠ 0 since the defense does not know the state
of the system. l_0(i) = 0 corresponds to the case where the observation shows no EMs surviving.

The technique used in finding the optimal set of a(i) for i = -I, ..., -1 is the iterative
descent search method under the assumption that the cost function c_n(-I) has a single relative
minimum. An example with the same numerical data as given in Case I is carried out for I = 5
and n = 5. The optimal solution is listed in Table 2. As shown in Table 2, at the (-5)th stage,
the defense should fire 6 SAMs and then observe the result. If the observation shows that there
are still some EMs left, the defense should launch two SAMs at the (-4)th stage. It is interesting
to compare the optimal semiclosed-loop cost and the closed-loop cost. The optimal cost for the
closed-loop policy is 8.1854 as given by Table 1. The semiclosed-loop policy provides an optimal
cost of 8.5798. In other words, the expected cost would be 4.8% more if one applied the optimal
semiclosed-loop policy instead of the closed-loop.
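The semiclosed-loop cost and a descent search over the salvo sizes can be sketched as below. The report does not specify its descent algorithm, so the unit-step coordinate descent here is an assumption chosen for brevity; the function names are hypothetical, and the transition probabilities follow the binomial reading of Eq. (2).

```python
from math import comb

def salvo_step(k, sams, q, next_cost):
    """Expected next-stage cost after firing `sams` SAMs at k >= 1 EMs."""
    total, prob_none_left = 0.0, 1.0
    for j in range(1, k + 1):
        kills = k - j
        if kills <= sams:
            f_jk = comb(sams, kills) * q**kills * (1.0 - q) ** (sams - kills)
            total += f_jk * next_cost[j]
            prob_none_left -= f_jk
    return total + prob_none_left * next_cost[0]

def semiclosed_cost(salvos, n, q, b, g):
    """c_n(-I) when salvos[t] SAMs are fired at stage t from every nonzero state."""
    cost = list(g)                          # c(0) = g
    for sams in reversed(salvos):           # backward recursion of Eq. (5)
        cost = [cost[0]] + [b * sams + salvo_step(k, sams, q, cost)
                            for k in range(1, n + 1)]
    return cost[n]

def descent_search(start, n, q, b, g):
    """Unit-step coordinate descent on integer salvo sizes; stops at a local minimum."""
    salvos = list(start)
    best = semiclosed_cost(salvos, n, q, b, g)
    improved = True
    while improved:
        improved = False
        for t in range(len(salvos)):
            for step in (1, -1):
                trial = list(salvos)
                trial[t] += step
                if trial[t] < 0:
                    continue
                c = semiclosed_cost(trial, n, q, b, g)
                if c < best - 1e-12:
                    salvos, best, improved = trial, c, True
    return salvos, best

g = [100.0 * j for j in range(6)]
found, found_cost = descent_search([5, 5, 5, 5, 5], 5, 0.632, 1.0, g)
```

A result of this kind can be compared with the solution a* = (6, 2, 2, 3, 6) and cost 8.5798 quoted for n = 5 and I = 5. A search that stops at a local minimum need not reproduce it from every starting point, which is why several initial guesses are worth trying.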

Remarks. 1. The results in Table 2 can be interpreted physically with the assistance of Table 1.
At the (-5)th stage, a*(-5) is close to l_5*(-5) = 5, which is the optimal solution in the steady
state. After a*(-5) has been launched, the probability of two and three EMs surviving is higher
than that of other states if the system is not in the zero state. Hence a*(-4) is 2. Finally, at the
(-1)st stage, if there are still some enemy missiles left, intuitively the probability of one EM
being left is much higher than that of two or three. Therefore, a*(-1) = 6, which is one more SAM
than l_1*(-1) = 5 in Table 1.

2. The assumption of a single relative minimum of the cost function can be loosely justified
numerically. The optimal solution listed in Table 2 is obtained by using a descent search algorithm
iteratively from an arbitrary initial guess of a(i), for i = -5, ..., -1. Three different sets of
initial guesses of a(i) have been tested, and all three sets converge to the optimal solution of
Table 2. This indicates that there is a single relative minimum in the domain which is defined by
these sets of initial guesses.


Table 2
Semiclosed Loop with I = 5 and n = 5

Variable     Optimal Solution
c*(-5)       8.5798
a*(-5)       6
a*(-4)       2
a*(-3)       2
a*(-2)       3
a*(-1)       6

Case III. No Observation (Open-Loop Control)

This case is a slight extension of Case II. The defense has no observations at all except
that the initial knowledge of the number of EMs is assumed. This situation may happen when
the radar fails to detect the enemy missiles or when the ship commander distrusts the radar
observations. The optimal launching policy for this case is easier to apply than that of the
closed-loop case, but at the expense of higher expected cost.
Consider the same numerical example as in Case II. Since the probability of hitting an EM,
q(i), is assumed independent of i, there is no difference in firing a SAM salvo earlier or later.
The only thing of importance is the total number of SAMs launched. Based on this argument, it
is immediately found from the (-1)st stage of Table 1 that for n = 5, the total number of SAMs
to be launched is 14, and the optimal expected cost is 15.111, which is much higher than the
cost of the semiclosed loop, 8.5798. Distributing these 14 SAMs into any combination of salvos
does not affect the expected cost. If q(i) is a function of i, the solution is also easily obtained
from the closed-loop program, since intuitively, the optimal launching policy is to fire all SAMs
at the single stage where the probability of hitting, q(i), is the highest.
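The open-loop argument can be checked with a short sketch: when q does not depend on the stage, only the total number of SAMs matters, so the problem reduces to a one-dimensional search. The function names and the search cap `max_sams` are assumptions for illustration, and the binomial hit model is the same reading of Eq. (2) used for the closed-loop case.

```python
from math import comb

def open_loop_cost(total_sams, n, q, b, g):
    """Expected cost when `total_sams` SAMs are fired without observation at n EMs."""
    cost = b * total_sams
    prob_all_killed = 1.0
    for survivors in range(1, n + 1):
        kills = n - survivors
        if kills > total_sams:
            continue
        p = comb(total_sams, kills) * q**kills * (1.0 - q) ** (total_sams - kills)
        cost += p * g[survivors]
        prob_all_killed -= p
    return cost + prob_all_killed * g[0]

def best_open_loop(n, q, b, g, max_sams=60):
    """Scan total salvo sizes 0..max_sams and return (best total, its expected cost)."""
    best = min(range(max_sams + 1), key=lambda s: open_loop_cost(s, n, q, b, g))
    return best, open_loop_cost(best, n, q, b, g)

total, cost = best_open_loop(5, 0.632, 1.0, [100.0 * j for j in range(6)])
```

For n = 5 this scan returns a total of 14 SAMs with an expected cost of about 15.111, matching the figures quoted from the (-1)st stage of Table 1.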

Case IV. Limited Number of SAMs

In the preceding three cases, the process of optimization is carried out without any limitation
on the number of SAMs available, contrary to the practical situation. The optimization problem
with this limitation can be viewed both as an optimal control problem with inequality control
constraints and as a resource allocation problem with the resource being the total number of SAMs
available. However, in the case of perfect observations, a different concept can be adopted to
reach the optimal solution. Let the total number of SAMs remaining be another state variable
denoted by m. Then, the optimal closed-loop solution, denoted by l_jm*(i), will be a function of
both state j, the number of EMs, and state m, the number of SAMs left. The transition of state m
at any stage is "deterministically" determined by l_jm*(i). For example, if l_jm*(i) SAMs are launched,
then the system will shift to state [m - l_jm*(i)] from state m. The transition of state j is still
probabilistically determined by l_jm*(i).

The same preceding numerical example is calculated for n = 3, I = 3, and m = 8. The results
for l_jm*(i) and c_jm*(i) are given in Table 3. For instance, we launch four SAMs at the (-3)rd
stage when three EMs are remaining and when eight SAMs are available. At the (-2)nd stage, the
total of SAMs available reduces to four. From the observations, if one EM is left, we should
launch one SAM. The optimal process is carried on in this way.
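The two-state-variable recursion can be sketched with memoized recursion over (stage, EMs, SAMs left). This is an illustrative version under the binomial reading of Eq. (2); the names `limited_sam_dp` and `solve` are hypothetical, and ties between equally good salvo sizes are broken in favor of the smaller salvo.

```python
from functools import lru_cache
from math import comb

def limited_sam_dp(q, b, g):
    """Return solve(i, j, m) -> (optimal salvo size, optimal expected cost)."""

    @lru_cache(maxsize=None)
    def solve(i, j, m):
        if i == 0 or j == 0:
            return 0, g[j]            # no time left, or no EM left
        best = None
        for sams in range(m + 1):     # cannot fire more than the m SAMs on board
            cost = b * sams
            prob_none_left = 1.0
            for r in range(1, j + 1):
                kills = j - r
                if kills > sams:
                    continue
                f = comb(sams, kills) * q**kills * (1.0 - q) ** (sams - kills)
                cost += f * solve(i + 1, r, m - sams)[1]
                prob_none_left -= f
            cost += prob_none_left * g[0]
            if best is None or cost < best[1] - 1e-12:
                best = (sams, cost)
        return best

    return solve

solve = limited_sam_dp(0.632, 1.0, [0.0, 100.0, 200.0, 300.0])
first_salvo, total_cost = solve(-3, 3, 8)   # three EMs, three salvos, eight SAMs
```

At the (-2)nd stage with two EMs and three SAMs left, the sketch selects two SAMs at a cost of about 38.244. The value returned by `solve(-3, 3, 8)` can be compared with the report's result of four SAMs for three EMs and eight SAMs on board.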

Remarks. 1. If the total number of SAMs available at j = 3 and i = -3 is more than 18,
which is obtained from Table 1 by adding 9, 5, and 4, the constraint on the total number of SAMs
is not active. The optimal launching policy then follows Table 1 of the closed-loop case.

2. The numerical results of Table 3 can be interpreted physically. For example, the reason
why l_23*(-2) (j = 2, m = 3) is 2 instead of 1 is that if we launch two SAMs at the (-2)nd stage,
there is a possibility that both of these EMs are intercepted. Therefore, at the (-1)st stage, we
have the chance to save the remaining one SAM. However, if we launch one instead of two, the
number of EMs left at the (-1)st stage would be at least one; therefore, the remaining two SAMs
are to be fired. In other words, there is no possibility of saving a SAM if we fire one instead of two.

3. The case of imperfect observation with limited SAMs can be calculated by applying the
techniques of dynamic programming (7). The case of no observation with limited SAMs is
trivial, following the argument of the last section.


CONCLUSION 

This  report  demonstrates  that  some  concepts  from  control  theory  can  be  employed  to  solve 
certain  operations  research  problems.  The  author  believes  that  the  model  and  technique  used  in 
this  report  can  be  applied  to  problems  of  various  areas,  such  as  economic  decision  processes 
and  inventory  control.  Other  areas  of  control  theory,  such  as  optimal  estimation,  stochastic 
control,  and  differential  games,  should  find  many  applications  in  operations  research.  For  example, 
arguments  in  the  present  report  can  be  extended  to  a  problem  in  finite  state  stochastic  games  (8), 
when  the  enemy  has  the  option  of  sending  more  missiles. 
Table 3
Closed-Loop Control with Limited SAMs
(for each number of EMs j, the l* line gives l_jm*(i) and the c* line gives c_jm*(i);
a parenthesized entry lists salvo sizes that give the same cost)

Number                            Number of SAMs Left (m)
of EMs
 (j)            1          2          3          4          5          6          7          8

THE (-3)RD STAGE
  1    l*     (0,1)      (0,1)        1          1          1          1          1          1
       c*    37.800     14.910      6.487      3.472      2.449      2.133      2.026      2.018
  2    l*     (0,1)     (0,1,2)    (0,1,2)       2          2          2          2          2
       c*    137.80     75.600     38.244     19.173     10.304      6.468      4.890      4.242
  3    l*     (0,1)     (0,1,2)   (0,1,2,3)  (0,1,2,3)      3          3          3          4
       c*    237.80     175.60     113.40     66.901     37.737     21.434      13.06      8.957

THE (-2)ND STAGE
  1    l*     (0,1)        1          1          1          2          2          2          2
       c*    37.800     14.910      6.719      3.938      3.081      2.790      2.768      2.768
  2    l*     (0,1)     (0,1,2)       2          2          3          3          3          3
       c*    137.800    75.600     38.244     19.467     10.758      6.975      5.508      5.062
  3    l*     (0,1)     (0,1,2)   (0,1,2,3)      3          4          4          4          5
       c*    237.800    175.60     113.40     66.901     37.989     21.789     13.539      9.594

THE (-1)ST STAGE
  1    l*       1          2          3          4          5          5          5          5
       c*    37.800     15.542      7.983      5.834      5.674      5.674      5.674      5.674
  2    l*       1          2          3          4          5          6          7          7
       c*    137.800    75.600     38.643     20.266     12.145      9.055      8.286      8.286
  3    l*       1          2          3          4          5          6          7          8
       c*    237.800    175.60     113.400    67.154     38.521     22.851     15.132     11.802